A Statistical Confidence-Based Adaptive Nearest Neighbor Algorithm for Pattern Classification
Authors
Abstract
The k-nearest neighbor rule is one of the simplest and most attractive pattern classification algorithms. It can be interpreted as an empirical Bayes classifier based on a posteriori probabilities estimated from the k nearest neighbors. Its performance relies on the assumption that the a posteriori probability is locally constant. This assumption, however, becomes problematic in high-dimensional spaces due to the curse of dimensionality. In this paper we introduce a locally adaptive nearest neighbor rule. Instead of using the Euclidean distance to locate the nearest neighbors, the proposed method takes into account the effective influence size of each training example and the statistical confidence with which the label of each training example can be trusted. We test the new method on real-world benchmark datasets and compare it with the standard k-nearest neighbor rule and support vector machines. The experimental results confirm the effectiveness of the proposed method.
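The empirical Bayes reading of the standard rule can be sketched in a few lines. This is only the baseline k-nearest neighbor rule with Euclidean distance, not the adaptive method proposed in the paper, whose details the abstract does not give. The function names and the toy data are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def knn_posteriors(train_X, train_y, query, k):
    """Estimate P(class | query) as the fraction of the k nearest
    training points (Euclidean distance) that carry each label."""
    # Rank training indices by distance to the query point.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return {label: count / k for label, count in votes.items()}


def knn_classify(train_X, train_y, query, k):
    """Pick the class with the largest estimated posterior (majority vote)."""
    posteriors = knn_posteriors(train_X, train_y, query, k)
    return max(posteriors, key=posteriors.get)


# Hypothetical toy data: two well-separated clusters.
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

posteriors = knn_posteriors(train_X, train_y, (4.0, 4.0), k=5)
label = knn_classify(train_X, train_y, (4.0, 4.0), k=5)
```

With k = 5 the neighborhood of the query reaches into both clusters, so the estimate mixes labels from regions where the true posterior differs. That is the failure of the locally constant assumption that the abstract refers to.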
Similar resources
Neighborhood size selection in the k-nearest-neighbor rule using statistical confidence
The k-nearest-neighbor rule is one of the most attractive pattern classification algorithms. In practice, the choice of k is determined by the cross-validation method. In this work, we propose a new method for neighborhood size selection that is based on the concept of statistical confidence. We define the confidence associated with a decision that is made by the majority rule from a finite num...
Locally Determining the Number of Neighbors in the k-Nearest Neighbor Rule Based on Statistical Confidence
The k-nearest neighbor rule is one of the most attractive pattern classification algorithms. In practice, the value of k is usually determined by the cross-validation method. In this work, we propose a new method that locally determines the number of nearest neighbors based on the concept of statistical confidence. We define the confidence associated with decisions that are made by the majority...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to many kinds of library resources. However, classifying documents within a large amount of data is still an issue, and finding particular documents demands time and energy. Grouping similar documents into specific classes can reduce the time needed to search for the required data, particularly for text documents. This is further facilitated by using Artificial...
Improving Nearest Neighbor Rule with a Simple Adaptive Distance Measure
The k-nearest neighbor rule is one of the simplest and most attractive pattern classification algorithms. However, it faces serious challenges when patterns of different classes overlap in some regions in the feature space. In the past, many researchers developed various adaptive or discriminant metrics to improve its performance. In this paper, we demonstrate that an extremely simple adaptive ...